A fast algorithm for robust constrained clustering

Authors

  • Heinrich Fritz
  • Luis Angel García-Escudero
  • Agustín Mayo-Iscar
Abstract

The application of “concentration” steps is the main principle behind Forgy’s k-means algorithm and Rousseeuw and Van Driessen’s fast-MCD algorithm. Although they share this principle, it is not completely straightforward to combine both algorithms to develop a clustering method that is not affected by a certain proportion of outlying observations and that is able to cope with non-spherical groups or with groups with different weights. However, these approaches can be successfully combined by additionally controlling the relative cluster scatters in the concentration steps. In this way, the appearance of uninteresting spurious clusters is avoided. An algorithm which implements such “constrained concentration” steps in a computationally efficient way will be presented in this work.
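
The idea of a "constrained concentration" step can be illustrated with a small sketch. The code below is not the authors' algorithm: it is a simplified toy version that assumes spherical clusters (one scalar variance per group) and enforces the scatter constraint by crude clipping, so that the largest variance is at most `c` times the smallest. The function names, the parameters `alpha` (trimming proportion) and `c` (scatter-ratio bound), and the clipping rule are all assumptions made for illustration.

```python
import numpy as np


def clip_scatters(variances, c):
    """Crude constraint (an assumption, not the paper's rule):
    raise small variances so that max(variances) / min(variances) <= c."""
    return np.maximum(variances, variances.max() / c)


def constrained_concentration(X, k, alpha=0.1, c=4.0, n_iter=20, seed=0):
    """Toy trimmed clustering with scatter-ratio control (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    keep = n - int(n * alpha)              # observations retained in each step
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.ones(k)                 # one spherical scatter per cluster
    weights = np.full(k, 1.0 / k)          # cluster weights
    trimmed = np.full(n, -1)               # -1 marks a trimmed observation
    for _ in range(n_iter):
        # Squared distances from every point to every center: shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Log of weight times spherical Gaussian density (constants dropped).
        score = np.log(weights) - 0.5 * p * np.log(variances) - 0.5 * d2 / variances
        labels = score.argmax(axis=1)
        best = score.max(axis=1)
        # Concentration: keep only the `keep` best-fitting observations.
        kept = np.argsort(best)[::-1][:keep]
        trimmed = np.full(n, -1)
        trimmed[kept] = labels[kept]
        # Re-estimate parameters from the retained observations only.
        for j in range(k):
            members = X[trimmed == j]
            if len(members) == 0:
                continue                   # leave an empty cluster unchanged
            centers[j] = members.mean(axis=0)
            spread = ((members - centers[j]) ** 2).sum() / (len(members) * p)
            variances[j] = max(spread, 1e-12)
            weights[j] = len(members) / keep
        # Constrained step: bound the relative cluster scatters.
        variances = clip_scatters(variances, c)
    return trimmed, centers, variances
```

Calling `constrained_concentration(X, k=2, alpha=0.1, c=4.0)` on an `(n, p)` array returns a label vector in which `-1` marks trimmed observations, together with the fitted centers and variances. Without the clipping step, a cluster fitted to a handful of nearly identical points could shrink its variance towards zero and dominate the objective, which is the kind of spurious cluster the abstract refers to.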

Similar resources

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...

ROBUST RESOURCE-CONSTRAINED PROJECT SCHEDULING WITH UNCERTAIN-BUT-BOUNDED ACTIVITY DURATIONS AND CASH FLOWS I. A NEW SAMPLING-BASED HYBRID PRIMARY-SECONDARY CRITERIA APPROACH

This paper presents a new primary-secondary-criteria scheduling model for the resource-constrained project scheduling problem (RCPSP) with uncertain activity durations (UD) and cash flows (UC). The RCPSP-UD-UC approach producing a “robust” resource-feasible schedule immunized against uncertainties in the activity durations and which is on the sampling-based scenarios may be evaluated from a cos...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

Stability Analysis and Robust PID Control of Cable Driven Robots Considering Elasticity in Cables

In this paper, robust PID control of fully-constrained cable driven parallel manipulators with elastic cables is studied in detail. In the dynamic analysis, it is assumed that the dominant dynamics of the cable can be approximated by a linear axial spring. To develop the idea of control for cable robots with elastic cables, a robust PID control for cable driven robots with ideal rigid cables is firstly de...

A Robust Knapsack Based Constrained Portfolio Optimization

Many portfolio optimization problems deal with allocation of assets which carry a relatively high market price. Therefore, it is necessary to determine the integer value of assets when we deal with portfolio optimization. In addition, one of the main concerns with most portfolio optimization is associated with the type of constraints considered in different models. In many cases, the resulted p...

Multiobjective Imperialist Competitive Evolutionary Algorithm for Solving Nonlinear Constrained Programming Problems

The nonlinear constrained programming problem (NCPP) arises in a diverse range of fields, such as portfolio, economic management, etc. In this paper, a multiobjective imperialist competitive evolutionary algorithm for solving NCPP is proposed. Firstly, we transform the NCPP into a biobjective optimization problem. Secondly, in order to improve the diversity of evolution country swarm, and he...

Journal:
  • Computational Statistics & Data Analysis

Volume 61

Publication date: 2013